IBM and University of Washington Launch Toucan, a Breakthrough AI Tool-Calling Dataset
IBM and the University of Washington have unveiled Toucan, which they describe as the largest public collection of real-world tool-use examples for AI systems. The dataset contains 1.5 million tasks spanning more than 2,000 digital tools and targets a key bottleneck in AI agent development: the shortage of high-quality training data for tool calling.
The researchers built the dataset with a multi-stage generation and validation pipeline. Five large language models generated task plans, and additional models simulated their execution. Tool metadata was sourced from GitHub and Smithery.ai, and non-functional tools were filtered out. Each scenario was then graded for difficulty and reliability.
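The filtering and grading steps described above can be illustrated with a minimal Python sketch. It is not the researchers' code: the `Tool` and `Scenario` classes, the `responds` flag, the 1-to-5 grading scale, and the `min_reliability` threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    source: str      # where the metadata came from, e.g. "github" or "smithery"
    responds: bool   # hypothetical result of a health check on the tool


@dataclass(frozen=True)
class Scenario:
    task: str
    tool_names: tuple  # names of the tools the task plan calls
    difficulty: int    # hypothetical grade, 1 (easy) to 5 (hard)
    reliability: int   # hypothetical grade, 1 (poor) to 5 (good)


def filter_functional(tools):
    """Drop tools that failed the (assumed) health check."""
    return [t for t in tools if t.responds]


def keep_scenarios(scenarios, tools, min_reliability=3):
    """Keep scenarios that use only working tools and meet the reliability bar."""
    available = {t.name for t in filter_functional(tools)}
    return [
        s for s in scenarios
        if set(s.tool_names) <= available and s.reliability >= min_reliability
    ]


# Illustrative data only; none of these entries come from Toucan itself.
example_tools = [
    Tool("calendar", "github", responds=True),
    Tool("pdf_reader", "smithery", responds=True),
    Tool("dead_api", "github", responds=False),
]
example_scenarios = [
    Scenario("Schedule a meeting", ("calendar",), difficulty=2, reliability=5),
    Scenario("Summarize a report", ("pdf_reader",), difficulty=3, reliability=2),
    Scenario("Call a retired service", ("dead_api",), difficulty=1, reliability=5),
]
kept = keep_scenarios(example_scenarios, example_tools)
```

In this sketch a scenario is discarded either because one of its tools is non-functional or because its reliability grade falls below the threshold; a real pipeline would derive both signals from model-based checks rather than hard-coded flags.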
Initial tests suggest that Toucan could help bridge the gap between conversational AI and functional digital assistants. The dataset covers complex operations such as report analysis, meeting scheduling, and document summarization. Agents trained on such tasks could eventually be integrated with blockchain analytics platforms and crypto trading systems, though no such integration has been announced.